Title: Pt film grain border detection

author: Nikolay Chehlarov

email: chehlarow@yahoo.com

git: https://github.com/Chehlarov

date: 17.02.2022.

Abstract

This work describes the development of an image-segmentation algorithm for detecting Pt grain borders in SEM images. Four U-Net variants are implemented and evaluated; the Spatial Attention U-Net proved best suited to the task. Hyperparameter tuning is performed to find the optimal settings for the model.

Problem definition

Background

Pt films are used in engineering for various purposes, for example temperature or strain sensing. The distribution of grain sizes is an important property for the product function. Traditional image-processing approaches failed to achieve acceptable results. A classical machine-learning algorithm with good performance was developed (https://github.com/Chehlarov/Machine-Learning/tree/main/00%20-%20project). In a production environment, however, images with different noise levels and scales have to be analyzed, and the ML approach did not deliver the expected generalization and production readiness. A deep-learning approach is expected to meet the production needs.

Goal

Create a model that performs image segmentation of Pt grains in a SEM image. Each pixel should be classified as border or non-border.

Architecture selection

The task is semantic segmentation, for which several suitable architectures exist. The difference between grain-border segmentation and a typical segmentation task is that grain borders are continuous curves of small width. A similar standard task is retinal vessel segmentation. Looking at the top-performing models for that task, one easily notices that several of them are based on the U-Net architecture. U-Net is also the most implemented paper. The U-Net variants can be summarized as:

U-Net is supposed to perform well even on small datasets. The available labeled dataset for border segmentation is small, so U-Net might be a suitable solution. A couple of the U-Net versions will be implemented and compared:

Spatial Attention U-Net

Highlights:

[Figure: SA-UNet architecture] [Figure: spatial attention module]
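The distinguishing part of SA-UNet is its spatial attention module. A minimal sketch of such a module, following the channel-wise average/max pooling plus 7x7 convolution scheme from the SA-UNet paper (the function name and exact placement in the network are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers

def spatial_attention(x):
    # Channel-wise average and max maps describe "where" salient features are
    avg_pool = tf.reduce_mean(x, axis=-1, keepdims=True)
    max_pool = tf.reduce_max(x, axis=-1, keepdims=True)
    concat = tf.concat([avg_pool, max_pool], axis=-1)
    # A 7x7 convolution produces a sigmoid gate that re-weights the features
    gate = layers.Conv2D(1, kernel_size=7, padding="same",
                         activation="sigmoid")(concat)
    return x * gate
```

The gate has a single channel, so the multiplication broadcasts the same spatial weighting over all feature channels.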

U-Net

Highlights:

[Figure: U-Net architecture]
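The classic U-Net is a symmetric encoder-decoder with skip connections. A minimal one-level sketch in TensorFlow 2 (the 256x176 patch size matches the dataset used later; the depth is reduced for brevity, and `start_neurons` is only a placeholder default):

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two 3x3 convolutions, as in the original U-Net
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def tiny_unet(input_shape=(256, 176, 1), start_neurons=16):
    inputs = layers.Input(input_shape)
    c1 = conv_block(inputs, start_neurons)          # encoder level
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, start_neurons * 2)          # bottleneck
    u1 = layers.Conv2DTranspose(start_neurons, 2, strides=2,
                                padding="same")(c2)
    u1 = layers.concatenate([u1, c1])               # skip connection
    c3 = conv_block(u1, start_neurons)              # decoder level
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c3)  # border probability
    return tf.keras.Model(inputs, outputs)
```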

Attention U-Net

Highlights:

[Figure: Attention U-Net architecture]

[Figure: attention gate]
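A minimal sketch of the additive attention gate, assuming the gating signal has already been resized to the skip connection's spatial size (implementations differ on where the resizing happens):

```python
import tensorflow as tf
from tensorflow.keras import layers

def attention_gate(x, g, inter_channels):
    # Project skip features x and gating signal g to a common channel count
    theta_x = layers.Conv2D(inter_channels, 1)(x)
    phi_g = layers.Conv2D(inter_channels, 1)(g)
    # Additive attention: relu(theta_x + phi_g) -> 1x1 conv -> sigmoid mask
    f = layers.Activation("relu")(theta_x + phi_g)
    psi = layers.Conv2D(1, 1, activation="sigmoid")(f)
    # Suppress irrelevant skip features before concatenation in the decoder
    return x * psi
```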

Attention Residual U-net

Highlights:

[Figure: residual block]
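A minimal sketch of a residual convolution block; the 1x1 convolution on the shortcut matches the channel counts so the addition is valid (batch normalization, present in many variants, is omitted here):

```python
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x, filters):
    # 1x1 convolution on the shortcut so channel counts match for the addition
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    return layers.Activation("relu")(y + shortcut)
```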

Implementation

The implementation is based on several GitHub repositories, with modifications to adapt them to the current task. The code is based on TensorFlow 2.

Imports

Function for models building and evaluation

This section contains the main building blocks for the U-net models.

Train and test data loading

Train and test datasets are organized in separate folders. The dataset is prepared from one image at 500x magnification and a second at 1000x. The images are sliced into 256x176 patches. Augmentation is applied to all patches by applying a median filter. The patches are split so that a patch and its augmented version land in the same subset (train, validation, or test); this avoids information leakage. Augmentation could be automated with a data generator, but some augmentation settings may produce unrealistic images. It is strongly recommended to visually inspect the augmented images if such an approach is considered.
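The slicing and augmentation described above can be sketched as follows (the patch size is from the text; the median-filter kernel size is an assumption):

```python
import numpy as np
from scipy.ndimage import median_filter

def slice_into_patches(image, patch_h=256, patch_w=176):
    """Cut an image into non-overlapping patch_h x patch_w patches,
    dropping any remainder at the right/bottom edges."""
    rows = image.shape[0] // patch_h
    cols = image.shape[1] // patch_w
    return [image[r * patch_h:(r + 1) * patch_h,
                  c * patch_w:(c + 1) * patch_w]
            for r in range(rows) for c in range(cols)]

def augment_patch(patch, size=3):
    # Median filtering yields a denoised variant of the same patch; the pair
    # must stay in the same split (train/validation/test) to avoid leakage
    return median_filter(patch, size=size)
```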

Model selection

Several iterations and hyperparameter searches have been done in the background. The next sections start with settings known to produce good results.
Objectives:

SA UNet model trials

As a starting point, the following parameters have been found to be optimal: block_size=19, keep_prob=0.8, start_neurons=20.

Attention UNet model trials

UNet (classic) model trials

Attention residual UNet model trials

SA Unet tuning - block_size, keep_prob, start_neurons

The results indicate that the smallest model, SA-UNet, gives the best Jaccard scores. The small number of channels in the network is presumably the reason for the better generalization. Bayesian optimization will be performed.
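For reference, the Jaccard score used for comparison can be computed as intersection-over-union of the binarized masks; a minimal NumPy sketch (the 0.5 threshold matches the one accepted later):

```python
import numpy as np

def jaccard_score(y_true, y_pred, threshold=0.5):
    """Intersection-over-union of binarized border masks."""
    t = np.asarray(y_true) > threshold
    p = np.asarray(y_pred) > threshold
    union = np.logical_or(t, p).sum()
    if union == 0:          # both masks empty: define as perfect agreement
        return 1.0
    return np.logical_and(t, p).sum() / union
```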

The top 5 parameter sets produce similar scores over a wide range of parameters (only keep_prob is always at the upper end). The block size in the top 3 models varies between 25, 35, and 20, producing objective values within the noise.

There is a small upward trend in the validation loss, a sign of overfitting. The val_mae_euclidean (the objective of the tuning) is smaller (good), but the Jaccard score is significantly smaller (not good) than for model_sa_unet1.

No further improvement was achieved with the smaller learning rate. No significant overfitting.

SA Unet tuning - loss function

A focal loss function is commonly used for segmentation tasks. Below, a hyperparameter search is performed to find the best values for gamma and the weight of the positive class. A comparison with the dice loss will be made.
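A sketch of the two loss functions being compared, written against tf.keras; note that with gamma=0 and pos_weight=1 the focal loss reduces to plain binary cross-entropy:

```python
import tensorflow as tf

def binary_focal_loss(gamma=2.0, pos_weight=1.0):
    """Binary focal loss with a positive-class weight."""
    def loss(y_true, y_pred):
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1.0 - 1e-7)
        # (1 - p)^gamma down-weights easy examples; gamma=0 gives plain BCE
        pos = -pos_weight * tf.pow(1.0 - y_pred, gamma) * y_true \
              * tf.math.log(y_pred)
        neg = -tf.pow(y_pred, gamma) * (1.0 - y_true) \
              * tf.math.log(1.0 - y_pred)
        return tf.reduce_mean(pos + neg)
    return loss

def dice_loss(y_true, y_pred, smooth=1.0):
    """1 - dice coefficient; `smooth` avoids division by zero."""
    inter = tf.reduce_sum(y_true * y_pred)
    total = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred)
    return 1.0 - (2.0 * inter + smooth) / (total + smooth)
```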

The best metric is achieved with gamma=0 and pos_weight=1, which is essentially binary cross-entropy. The top 5 models have similar performance.

Binary cross-entropy produced a similar MAE euclidean and a smaller Jaccard loss. The loss is approaching 0, limiting the space for improvement. The risk of overfitting with binary cross-entropy is higher.

SA Unet tuning - Adam parameters

The best parameters differ from those already used, but the benefits are small to negligible. All top settings use beta_1=0.85, which is at the edge of the search space; a lower beta_1 might be beneficial, but it is not explored in this work. Let's train and look at the curves.
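For illustration, the tuned optimizer next to the Keras defaults (beta_1=0.9, beta_2=0.999); the learning rate here is only a placeholder:

```python
import tensorflow as tf

# Tuned setting found by the search (beta_1 at the edge of the search space)
opt_tuned = tf.keras.optimizers.Adam(learning_rate=1e-3,
                                     beta_1=0.85, beta_2=0.999)
# Keras defaults, preferred in the end since the differences are within noise
opt_default = tf.keras.optimizers.Adam(learning_rate=1e-3)
```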

Comparing the graphs of model_sa_unet4 and model_sa_unet2 gives an indication of the effect of different beta_1 and beta_2. All differences are within the noise of the curves. It is therefore preferable to use the default beta_1 and beta_2.

Final model selection

The model model_sa_unet1 will be accepted as final. Motivation:

Key implementation details:

The trained model can be loaded from the file "2022-02-17 SA-UNet1 200epochs.hdf5". The custom metrics have to be specified during loading.
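Loading follows the standard Keras pattern of passing the custom metrics via `custom_objects`. A self-contained round-trip demo with a stand-in model and a placeholder metric (the real notebook passes its own metric functions and the hdf5 file named above):

```python
import numpy as np
import tensorflow as tf

def jaccard(y_true, y_pred):
    # Placeholder custom metric; the notebook defines the real one earlier
    inter = tf.reduce_sum(y_true * y_pred)
    union = tf.reduce_sum(y_true) + tf.reduce_sum(y_pred) - inter
    return inter / (union + 1e-7)

# Save a tiny compiled model, then reload it the same way the trained
# SA-UNet file would be reloaded
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)),
                             tf.keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[jaccard])
model.save("demo_model.h5")
reloaded = tf.keras.models.load_model("demo_model.h5",
                                      custom_objects={"jaccard": jaccard})
```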

Final model analysis

The histograms show good separation between border and grain pixels. The accepted threshold of 0.5 is good; reasonable changes of the threshold will not yield very different results. Looking at the histogram alone, one could conclude that the classifier's performance is quite poor. However, when the spatial tolerance from the picture above is taken into account, the results look quite promising. It appears that FP and FN pixel borders often run in parallel; the shift could be due to errors in the ground truth. Some FP are actually quite questionable even for a human, and some people would classify them as borders.
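The spatial acceptance can be sketched by dilating each mask before counting errors, so a predicted border within a few pixels of a true border is not punished (the tolerance value here is an assumption):

```python
import numpy as np
from scipy.ndimage import binary_dilation

def tolerant_errors(y_true, y_pred, tol=2, threshold=0.5):
    """FP/FN maps with spatial tolerance: predictions within `tol` pixels
    of a true border are not counted as false positives, and vice versa."""
    t = np.asarray(y_true) > threshold
    p = np.asarray(y_pred) > threshold
    t_near = binary_dilation(t, iterations=tol)
    p_near = binary_dilation(p, iterations=tol)
    fp = p & ~t_near    # predicted border far from any true border
    fn = t & ~p_near    # true border missed even with the tolerance
    return fp, fn
```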

On the picture above, the zones with the biggest errors are shown in red.

Final model evaluation

The model performance is similar on the test and the validation dataset.

The FN pixels could probably be reduced by adding more training data.

Running the model on production images

(no manual labeling available)

Functions needed for production
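Production images may not be an exact multiple of the 256x176 patch size; a small helper (the name is illustrative) can pad them with reflection before patchwise prediction:

```python
import numpy as np

def pad_to_multiple(image, patch_h=256, patch_w=176):
    """Pad an image at the bottom/right so both dimensions become multiples
    of the patch size; reflection avoids hard artificial edges."""
    pad_h = (-image.shape[0]) % patch_h
    pad_w = (-image.shape[1]) % patch_w
    return np.pad(image, ((0, pad_h), (0, pad_w)), mode="reflect")
```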

Production run

The processed images are stored in the original folder.

Backup

Discussion

Comparison to the ML approach using XGBoost: https://github.com/Chehlarov/Machine-Learning/tree/main/00%20-%20project

Notes on the dataset:

Lessons learned:

Ideas for improvement:

Conclusions:

References